Integration of Multiple Bilingually-Trained Segmentation Schemes into Statistical Machine Translation
Similar resources
Integration of Multiple Bilingually-Learned Segmentation Schemes into Statistical Machine Translation
This paper proposes an unsupervised word segmentation algorithm that identifies word boundaries in continuous source language text in order to improve the translation quality of statistical machine translation (SMT) approaches. The method can be applied to any language pair where the source language is unsegmented and the target language segmentation is known. First, an iterative bootstrap meth...
Bilingually Motivated Domain-Adapted Word Segmentation for Statistical Machine Translation
We introduce a word segmentation approach to languages where word boundaries are not orthographically marked, with application to Phrase-Based Statistical Machine Translation (PB-SMT). Instead of using manually segmented monolingual domain-specific corpora to train segmenters, we make use of bilingual corpora and statistical word alignment techniques. First of all, our approach is adapted for t...
Effects of Integrating Multiple Bilingually-Trained Segmentation Schemes for Japanese-English SMT
This paper proposes a method to integrate multiple segmentation schemes into a single statistical machine translation (SMT) system by characterizing the source language side and merging identical translation pairs of differently segmented SMT models. Experimental results translating Japanese into English revealed that the proposed method of integrating multiple segmentation schemes outperforms ...
An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation
Attempts to estimate phrase translation probabilities for statistical machine translation using iteratively-trained models have repeatedly failed to produce translations as good as those obtained by estimating phrase translation probabilities from surface statistics of bilingual word alignments as described by Koehn et al. (2003). We propose a new iteratively-trained phrase translation model tha...
Journal
Journal title: IEICE Transactions on Information and Systems
Year: 2011
ISSN: 0916-8532,1745-1361
DOI: 10.1587/transinf.e94.d.690